110 research outputs found
Searching for network modules
When analyzing complex networks a key target is to uncover their modular
structure, which means searching for a family of modules, namely node subsets
spanning each a subnetwork more densely connected than the average. This work
proposes a novel type of objective function for graph clustering, in the form
of a multilinear polynomial whose coefficients are determined by network
topology. It may be thought of as a potential function, to be maximized, taking
its values on fuzzy clusterings or families of fuzzy subsets of nodes over
which every node distributes a unit membership. When suitably parametrized,
this potential is shown to attain its maximum when every node concentrates its
all unit membership on some module. The output thus is a partition, while the
original discrete optimization problem is turned into a continuous version
allowing to conceive alternative search strategies. The instance of the problem
being a pseudo-Boolean function assigning real-valued cluster scores to node
subsets, modularity maximization is employed to exemplify a so-called quadratic
form, in that the scores of singletons and pairs also fully determine the
scores of larger clusters, while the resulting multilinear polynomial potential
function has degree 2. After considering further quadratic instances, different
from modularity and obtained by interpreting network topology in alternative
manners, a greedy local-search strategy for the continuous framework is
analytically compared with an existing greedy agglomerative procedure for the
discrete case. Overlapping is finally discussed in terms of multiple runs, i.e.
several local searches with different initializations.Comment: 10 page
MINE: Module Identification in Networks
<p>Abstract</p> <p>Background</p> <p>Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks.</p> <p>Results</p> <p>MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the <it>C. elegans </it>protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties.</p> <p>Conclusions</p> <p>MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both <it>S. cerevisiae </it>and <it>C. elegans</it>.</p
Comparative analysis of thermophilic and mesophilic proteins using Protein Energy Networks
<p>Abstract</p> <p>Background</p> <p>Thermophilic proteins sustain themselves and function at higher temperatures. Despite their structural and functional similarities with their mesophilic homologues, they show enhanced stability. Various comparative studies at genomic, protein sequence and structure levels, and experimental works highlight the different factors and dominant interacting forces contributing to this increased stability.</p> <p>Methods</p> <p>In this comparative structure based study, we have used interaction energies between amino acids, to generate structure networks called as Protein Energy Networks (PENs). These PENs are used to compute network, sub-graph, and node specific parameters. These parameters are then compared between the thermophile-mesophile homologues.</p> <p>Results</p> <p>The results show an increased number of clusters and low energy cliques in thermophiles as the main contributing factors for their enhanced stability. Further more, we see an increase in the number of hubs in thermophiles. We also observe no community of electrostatic cliques forming in PENs.</p> <p>Conclusion</p> <p>In this study we were able to take an energy based network approach, to identify the factors responsible for enhanced stability of thermophiles, by comparative analysis. We were able to point out that the sub-graph parameters are the prominent contributing factors. The thermophiles have a better-packed hydrophobic core. We have also discussed how thermophiles, although increasing stability through higher connectivity retains conformational flexibility, from a cliques and communities perspective.</p
Interactive analysis of systems biology molecular expression data
Peer reviewedPublisher PD
Revisiting Date and Party Hubs: Novel Approaches to Role Assignment in Protein Interaction Networks
The idea of 'date' and 'party' hubs has been influential in the study of
protein-protein interaction networks. Date hubs display low co-expression with
their partners, whilst party hubs have high co-expression. It was proposed that
party hubs are local coordinators whereas date hubs are global connectors. Here
we show that the reported importance of date hubs to network connectivity can
in fact be attributed to a tiny subset of them. Crucially, these few, extremely
central, hubs do not display particularly low expression correlation,
undermining the idea of a link between this quantity and hub function. The
date/party distinction was originally motivated by an approximately bimodal
distribution of hub co-expression; we show that this feature is not always
robust to methodological changes. Additionally, topological properties of hubs
do not in general correlate with co-expression. Thus, we suggest that a
date/party dichotomy is not meaningful and it might be more useful to conceive
of roles for protein-protein interactions rather than individual proteins. We
find significant correlations between interaction centrality and the functional
similarity of the interacting proteins.Comment: 27 pages, 5 main figures, 4 supplementary figure
Cohesive versus Flexible Evolution of Functional Modules in Eukaryotes
Although functionally related proteins can be reliably predicted from phylogenetic profiles, many functional modules do not seem to evolve cohesively according to case studies and systematic analyses in prokaryotes. In this study we quantify the extent of evolutionary cohesiveness of functional modules in eukaryotes and probe the biological and methodological factors influencing our estimates. We have collected various datasets of protein complexes and pathways in Saccheromyces cerevisiae. We define orthologous groups on 34 eukaryotic genomes and measure the extent of cohesive evolution of sets of orthologous groups of which members constitute a known complex or pathway. Within this framework it appears that most functional modules evolve flexibly rather than cohesively. Even after correcting for uncertain module definitions and potentially problematic orthologous groups, only 46% of pathways and complexes evolve more cohesively than random modules. This flexibility seems partly coupled to the nature of the functional module because biochemical pathways are generally more cohesively evolving than complexes
Efficient and accurate greedy search methods for mining functional modules in protein interaction networks
<p>Abstract</p> <p>Background</p> <p>Most computational algorithms mainly focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain Core/attachment structures.</p> <p>Methods</p> <p>In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches, however it is computationally expensive. Many module detection approaches are based on the traditional hierarchical methods, which is also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge weight based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into the suitable set of modules.</p> <p>Results</p> <p>The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate as compared to other competing algorithms.</p> <p>Conclusions</p> <p>Based on the new edge weight definition, the proposed algorithm takes advantages of the greedy search procedure to separate the network into the suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the computational time significantly while keeping high prediction accuracy.</p
Clique-based data mining for related genes in a biomedical database
<p>Abstract</p> <p>Background</p> <p>Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph.</p> <p>Results</p> <p>We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes.</p> <p>Conclusion</p> <p>We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.</p
Overlapping Community Discovery Methods: A Survey
The detection of overlapping communities is a challenging problem which is
gaining increasing interest in recent years because of the natural attitude of
individuals, observed in real-world networks, to participate in multiple groups
at the same time. This review gives a description of the main proposals in the
field. Besides the methods designed for static networks, some new approaches
that deal with the detection of overlapping communities in networks that change
over time, are described. Methods are classified with respect to the underlying
principles guiding them to obtain a network division in groups sharing part of
their nodes. For each of them we also report, when available, computational
complexity and web site address from which it is possible to download the
software implementing the method.Comment: 20 pages, Book Chapter, appears as Social networks: Analysis and Case
Studies, A. Gunduz-Oguducu and A. S. Etaner-Uyar eds, Lecture Notes in Social
Networks, pp. 105-125, Springer,201
- …